4,006 research outputs found

    HyGen: Generating Random Graphs with Hyperbolic Communities

    No full text

    Boolean Matrix Factorization Meets Consecutive Ones Property

    No full text
    Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization in which we additionally require that the factor matrices have the consecutive ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layouts, where nodes are ordered on a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbons. We also show that we can use OBMF for edge bundling combined with circular or linear layout techniques. We demonstrate that not only is this problem NP-hard, but it also admits no polynomial-time algorithm with a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm where at each step we look for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm where we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using PQ-trees. We also extend the problem to the cyclic ones property and symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well.
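The alternating step described in the abstract above (fix one side, find the other, swap roles, repeat) can be illustrated with a minimal sketch. This is not the authors' algorithm: it does not use PQ-trees and does not enforce the consecutive ones property during the update. It only shows the plain rank-1 alternation under the Boolean product, plus a check of whether the ones in each column are contiguous for a given, fixed row order. All function names are hypothetical.

```python
import numpy as np

def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is 1 if B[i, k] = C[k, j] = 1 for some k."""
    return (B.astype(int) @ C.astype(int) > 0).astype(int)

def reconstruction_error(A, B, C):
    """Number of entries where A differs from the Boolean product of B and C."""
    return int(np.sum(A != boolean_product(B, C)))

def has_consecutive_ones_in_order(M):
    """True if, in the GIVEN row order, the ones of every column are contiguous.
    (Assumption: no search over row permutations, which is what PQ-trees provide.)"""
    for col in M.T:
        idx = np.flatnonzero(col)
        if idx.size and idx[-1] - idx[0] + 1 != idx.size:
            return False
    return True

def update_side(A, fixed):
    """With the 0/1 vector `fixed` over columns held constant, choose the row vector:
    a row is selected iff it has more ones than zeros among the selected columns."""
    sel = A[:, fixed == 1]
    ones = sel.sum(axis=1)
    zeros = sel.shape[1] - ones
    return (ones > zeros).astype(int)

def rank1_alternating(A, init_cols, rounds=10):
    """Alternate between updating the row vector and the column vector
    until neither changes or `rounds` is exhausted."""
    c = init_cols.copy()
    r = update_side(A, c)
    for _ in range(rounds):
        new_c = update_side(A.T, r)
        new_r = update_side(A, new_c)
        if np.array_equal(new_c, c) and np.array_equal(new_r, r):
            break
        c, r = new_c, new_r
    return r, c
```

A rank-1 factor found this way can be scored with `reconstruction_error(A, r[:, None], c[None, :])`. A greedy scheme in the spirit of the abstract would repeat the search on the still-uncovered ones, which this sketch leaves out.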

    MDL4BMF: Minimum Description Length for Boolean Matrix Factorization

    No full text
    Matrix factorizations, where a given data matrix is approximated by a product of two or more factor matrices, are powerful data mining tools. Among other tasks, matrix factorizations are often used to separate global structure from noise. This, however, requires solving the 'model order selection problem' of determining where fine-grained structure stops and noise starts, i.e., what is the proper size of the factor matrices. Boolean matrix factorization (BMF), where data, factors, and matrix product are Boolean, has received increased attention from the data mining community in recent years. The technique has desirable properties, such as high interpretability and natural sparsity. However, so far no method for selecting the correct model order for BMF has been available. In this paper we propose to use the Minimum Description Length (MDL) principle for this task. Besides solving the problem, this well-founded approach has numerous benefits, e.g., it is automatic, does not require a likelihood function, is fast, and, as experiments show, is highly accurate. We formulate the description length function for BMF in general, making it applicable for any BMF algorithm. We discuss how to construct an appropriate encoding: starting from a simple and intuitive approach, we arrive at a highly efficient data-to-model based encoding for BMF. We extend an existing algorithm for BMF to use MDL to identify the best Boolean matrix factorization, analyze the complexity of the problem, and perform an extensive experimental evaluation to study its behavior.

    The Mandatory Forest Certification Scheme as a Tool for Sustainable Forest Management in Russia

    The Certification Law in the Russian Federation regulates both voluntary and mandatory forest certification. The Mandatory Forest Certification Scheme (MFCS) was developed observing the principles, criteria and indicators of the Helsinki and Montreal processes, as well as the Russian list of criteria and indicators. The principles of the Forest Stewardship Council and the International Organization for Standardization Standard 14001 were also used as reference. The scheme has been tested in five regions, and an audit of a large North American forest company will be carried out during the summer of 2001 in Karelia. The mandatory scheme differs in some respects from the certification systems developed elsewhere. One of the major distinguishing features is that the set of criteria is presented in the form of 24 normative documents, including the Forest Code. In addition, the applicant for the MFCS certificate is the forest user, instead of the forest owner, which is the state in the Russian Federation. The scheme aims to cover the ecological, economic, social and cultural aspects of sustainable forestry, and an independent certification body issues the certificate. The scheme includes third-party auditing and provides the possibility for the state or public organizations to supervise forest logging, and to request non-scheduled auditing from the Forest Certification Center if deemed necessary. The scheme aims to complement the Helsinki and Montreal processes by putting the general forest policy into action at the operational level in the leskhozes.

    Distributed Performance Analysis on the Internet Using a Centric Database

    In many areas of life it is useful to be able to compare one's own performance to some general benchmark data. The Internet provides a way of realizing such a comparison so that the original database can be hidden from users by locating it on a server computer, and users can test their individual data in a distributed manner. An interactive and graphical user interface can be implemented with the tools of the World-Wide Web (WWW). We introduce a World-wide INTEractive Regression Analysis (WINTERA) system that operates via the Internet. The system enables a user to carry out regression analysis with an original database and evaluate the performance of his or her own data vector. There are two kinds of users in the system. Data suppliers enter their observation matrices to form databases. Ordinary users can evaluate observations of their own with respect to existing databases. They can also suggest their observations to be included in the databases. The data supplier decides whether (s)he accepts or rejects the information. This means that the whole database is accessible only to the data supplier. In any case, ordinary users receive information about their performance.

    Clustering Boolean Tensors

    Tensor factorizations are computationally hard problems, and in particular, are often significantly harder than their matrix counterparts. In the case of Boolean tensor factorizations, where the input tensor and all the factors are required to be binary and we use Boolean algebra, much of that hardness comes from the possibility of overlapping components. Yet, in many applications we are perfectly happy to partition at least one of the modes. In this paper we investigate what consequences this partitioning has for the computational complexity of Boolean tensor factorizations and present a new algorithm for the resulting clustering problem. This algorithm can alternatively be seen as a particularly regularized clustering algorithm that can handle extremely high-dimensional observations. We analyse our algorithms with the goal of maximizing the similarity and argue that this is more meaningful than minimizing the dissimilarity. As a by-product we obtain a PTAS and an efficient 0.828-approximation algorithm for rank-1 binary factorizations. Our algorithm for Boolean tensor clustering achieves high scalability, high similarity, and good generalization to unseen data with both synthetic and real-world data sets.

    An ALMA survey of submillimetre galaxies in the COSMOS field: Physical properties derived from energy balance spectral energy distribution modelling

    Context. Submillimetre galaxies (SMGs) represent an important source population in the origin and cosmic evolution of the most massive galaxies. Hence, it is imperative to place firm constraints on the fundamental physical properties of large samples of SMGs. Aims. We determine the physical properties of a sample of SMGs in the COSMOS field that were pre-selected at the observed-frame wavelength of λ_(obs) = 1.1 mm, and followed up at λ_(obs) = 1.3 mm with the Atacama Large Millimetre/submillimetre Array (ALMA). Methods. We used the MAGPHYS model package to fit the panchromatic (ultraviolet to radio) spectral energy distributions (SEDs) of 124 of the target SMGs, which lie at a median redshift of z = 2.30 (19.4% are spectroscopically confirmed). The SED analysis was complemented by estimating the gas masses of the SMGs by using the λ_(obs) = 1.3 mm dust emission as a tracer of the molecular gas component. Results. The sample median and 16th–84th percentile ranges of the stellar masses, obscured star formation rates, dust temperatures, and dust and gas masses were derived to be log(M⋆/M⊙) = 11.09^(+0.41)_(-0.53), SFR = 402^(+661)_(-233) M⊙ yr^(-1), T_(dust) = 39.7^(+9.7)_(-7.4) K, log(M_(dust)/M⊙) = 9.01^(+0.20)_(-0.31), and log(M_(gas)/M⊙) = 11.34^(+0.20)_(-0.23), respectively. The M_(dust)/M⋆ ratio was found to decrease as a function of redshift, while the M_(gas)/M_(dust) ratio shows the opposite, positive correlation with redshift. The derived median gas-to-dust ratio of 120^(+73)_(-30) agrees well with the canonical expectation. The gas fraction (M_(gas)/(M_(gas) + M⋆)) was found to range from 0.10 to 0.98 with a median of 0.62^(+0.27)_(-0.23). We found that 57.3% of our SMGs populate the main sequence (MS) of star-forming galaxies, while 41.9% of the sources lie above the MS by a factor of greater than three (one source lies below the MS). These super-MS objects, or starbursts, are preferentially found at z ≳ 3, which likely reflects the sensitivity limit of our source selection. We estimated that the median gas consumption timescale for our SMGs is ~535 Myr, and the super-MS sources appear to consume their gas reservoir faster than their MS counterparts. We found no obvious stellar mass–size correlations for our SMGs, where the sizes were measured in the observed-frame 3 GHz radio emission and rest-frame UV. However, the largest 3 GHz radio sizes are found among the MS sources. Those SMGs that appear irregular in the rest-frame UV are predominantly starbursts, while the MS SMGs are mostly disk-like. Conclusions. The physical parameter distributions of our SMGs and those of the equally bright, 870 μm selected SMGs in the ECDFS field (the so-called ALESS SMGs) are unlikely to be drawn from common parent distributions. This might reflect the difference in the pre-selection wavelength. Albeit being partly a selection bias, the abrupt jump in specific SFR and the offset from the MS of our SMGs at z ≳ 3 might also reflect a more efficient accretion from the cosmic gas streams, higher incidence of gas-rich major mergers, or higher star formation efficiency at z ≳ 3. We found a rather flat average trend between the SFR and dust mass, but a positive SFR–M_(gas) correlation. However, to address the questions of which star formation law(s) our SMGs follow, and how they compare with the Kennicutt-Schmidt law, the dust-emitting sizes of our sources need to be measured. Nonetheless, the larger radio-emitting sizes of the MS SMGs compared to starbursts is a likely indication of their more widespread, less intense star formation activity. The irregular rest-frame UV morphologies of the starburst SMGs are likely to echo their merger nature. The current stellar mass content of the studied SMGs is very high, so they must quench to form the so-called red-and-dead massive ellipticals. Our results suggest that the transition from high-z SMGs to local ellipticals via compact, quiescent galaxies (cQGs) at z ~ 2 might not be universal, and the latter population might also descend from the so-called blue nuggets. However, z ≳ 4 SMGs could be the progenitors of higher redshift, z ≳ 3 cQGs, while our results are also consistent with the possibility that ultra-massive early-type galaxies found at 1.2 ≲ z ≲ 2 experienced an SMG phase at z ≤ 3.
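As a rough plausibility check on the numbers quoted in the abstract above, the gas fraction and gas consumption timescale can be recomputed from the sample medians alone. This is only a sketch under two assumptions that the abstract does not confirm: that the consumption timescale is defined as M_gas / SFR, and that ratios of sample medians are a fair stand-in for the paper's medians of per-galaxy ratios (in general they are not identical, so small differences from the quoted 0.62 and ~535 Myr are expected). Variable and function names are illustrative.

```python
# Sample medians quoted in the abstract (log10 of masses in solar masses).
LOG_M_STAR = 11.09   # stellar mass
LOG_M_GAS = 11.34    # gas mass
SFR = 402.0          # obscured star formation rate, solar masses per year

def gas_fraction(m_gas, m_star):
    """Gas fraction as defined in the abstract: M_gas / (M_gas + M_star)."""
    return m_gas / (m_gas + m_star)

def consumption_time_myr(m_gas, sfr):
    """Gas consumption timescale in Myr, ASSUMING the definition M_gas / SFR."""
    return m_gas / sfr / 1.0e6

m_star = 10.0 ** LOG_M_STAR
m_gas = 10.0 ** LOG_M_GAS

f_gas = gas_fraction(m_gas, m_star)
t_cons = consumption_time_myr(m_gas, SFR)

print(f"gas fraction from medians: {f_gas:.2f}")
print(f"consumption timescale from medians: {t_cons:.0f} Myr")
```

Both values land close to the medians reported in the abstract, which suggests the quoted quantities are mutually consistent at the level this kind of back-of-the-envelope check can resolve.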

    An ALMA survey of submillimetre galaxies in the COSMOS field: The extent of the radio-emitting region revealed by 3 GHz imaging with the Very Large Array

    Context. The observed spatial scale of the radio continuum emission from star-forming galaxies can be used to investigate the spatial extent of active star formation, constrain the importance of cosmic-ray transport, and examine the effects of galaxy interactions. Aims. We determine the radio size distribution of a large sample of 152 submillimetre galaxies (SMGs) in the COSMOS field that were pre-selected at 1.1 mm, and later detected with the Atacama Large Millimetre/submillimetre Array (ALMA) in the observed-frame 1.3 mm dust continuum emission at a signal-to-noise ratio (S/N) of ≥ 5. Methods. We used the deep, subarcsecond-resolution (1σ = 2.3 μJy beam^(-1); 0.75″) centimetre radio continuum observations taken by the Karl G. Jansky Very Large Array (VLA)-COSMOS 3 GHz Large Project. Results. One hundred and fifteen of the 152 target SMGs (76% ± 7%) were found to have a 3 GHz counterpart (≥ 4.2σ), which renders the radio detection rate notably high. The median value of the deconvolved major axis full width at half maximum (FWHM) size at 3 GHz is derived to be 0.59″ ± 0.05″, or 4.6 ± 0.4 kpc in physical units, where the median redshift of the sources is z = 2.23 ± 0.13 (23% are spectroscopic and 77% are photometric values). The radio sizes are roughly log-normally distributed, and they show no evolutionary trend with redshift, or difference between different galaxy morphologies. We also derived the spectral indices between 1.4 and 3 GHz, and 3 GHz brightness temperatures for the sources, and the median values were found to be α_(1.4 GHz)^(3 GHz) = -0.67 (S_ν ∝ ν^α) and T_B = 12.6 ± 2 K. Three of the target SMGs, which are also detected with the Very Long Baseline Array (VLBA) at 1.4 GHz (AzTEC/C24b, 61, and 77a), show clearly higher brightness temperatures than the typical values, reaching T_B(3 GHz) > 10^(4.03) K for AzTEC/C61. Conclusions. The derived median radio spectral index agrees with a value expected for optically thin non-thermal synchrotron radiation, and the low median 3 GHz brightness temperature shows that the observed radio emission is predominantly powered by star formation and supernova activity. However, our results provide a strong indication of the presence of an active galactic nucleus in the VLBA and X-ray-detected SMG AzTEC/C61 (high T_B and an inverted radio spectrum). The median radio-emitting size we have derived is ~1.5–3 times larger than the typical far-infrared dust-emitting sizes of SMGs, but similar to that of the SMGs' molecular gas component traced through mid-J line emission of carbon monoxide. The physical conditions of SMGs probably render the diffusion of cosmic-ray electrons inefficient, and hence an unlikely process to lead to the observed extended radio sizes. Instead, our results point towards a scenario where SMGs are driven by galaxy interactions and mergers. Besides triggering vigorous starbursts, galaxy collisions can also pull out the magnetised fluids from the interacting disks, and give rise to a taffy-like synchrotron-emitting bridge. This provides an explanation for the spatially extended radio emission of SMGs, and can also cause a deviation from the well-known infrared-radio correlation owing to an excess radio emission. Nevertheless, further high-resolution observations are required to examine the other potential reasons for the very compact dust-emitting sizes of SMGs, such as the radial dust temperature and metallicity gradients.

    Mevalonate pathway regulates cell size homeostasis and proteostasis through autophagy

    Summary: Balance between cell growth and proliferation determines cell size homeostasis, but little is known about how metabolic pathways are involved in the maintenance of this balance. Here, we perform a screen with a library of clinically used drug molecules for their effects on cell size. We find that statins, inhibitors of the mevalonate pathway, reduce cell proliferation and increase cell size and cellular protein density in various cell types, including primary human cells. Mevalonate pathway effects on cell size and protein density are mediated through geranylgeranylation of the small GTPase RAB11, which is required for basal autophagic flux. Our results identify the mevalonate pathway as a metabolic regulator of autophagy and expose a paradox in the regulation of cell size and proteostasis, where inhibition of an anabolic pathway can cause an increase in cell size and cellular protein density.